The Loss Rank Principle for Model Selection

Author

  • Marcus Hutter
Abstract

We introduce a new principle for model selection in regression and classification. Many regression models are controlled by some smoothness or flexibility or complexity parameter c, e.g. the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. Let $\hat f_D^c$ be the (best) regressor of complexity c on data D. A more flexible regressor can fit more data D′ well than a more rigid one. If something (here small loss) is easy to achieve it’s typically worth less. We define the loss rank of $\hat f_D^c$ as the number of other (fictitious) data D′ that are fitted better by $\hat f_{D'}^c$ than D is fitted by $\hat f_D^c$. We suggest selecting the model complexity c that has minimal loss rank (LoRP). Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP only depends on the regression functions and the loss function. It works without a stochastic noise model, and is directly applicable to any non-parametric regressor, like kNN. In this paper we formalize, discuss, and motivate LoRP, study it for specific regression problems, in particular linear ones, and compare it to other model selection schemes.
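
The definition of loss rank in the abstract can be illustrated with a brute-force sketch for a toy setting. Everything specific here is an assumption made for illustration and is not taken from the paper: the labels are binary so that all fictitious data D′ can be enumerated, the inputs are one-dimensional, the kNN regressor averages the k nearest training points (the point itself included), the loss is the squared error on the training data, and ties are counted as "fitted at least as well". The function names `knn_fit`, `empirical_loss`, `loss_rank` and `select_k` are hypothetical.

```python
from itertools import product


def knn_fit(xs, ys, k):
    """Predict each training point as the mean of its k nearest neighbours.

    Assumption: the point itself counts as its own nearest neighbour,
    and distance ties are broken by index.
    """
    preds = []
    for xi in xs:
        order = sorted(range(len(xs)), key=lambda j: (abs(xs[j] - xi), j))
        nearest = order[:k]
        preds.append(sum(ys[j] for j in nearest) / k)
    return preds


def empirical_loss(xs, ys, k):
    """Squared-error loss of the kNN fit on its own training data."""
    preds = knn_fit(xs, ys, k)
    return sum((y - p) ** 2 for y, p in zip(ys, preds))


def loss_rank(xs, ys, k, labels=(0, 1)):
    """Count the label vectors y' that kNN fits at least as well as ys.

    The inputs xs stay fixed; only the labels are varied. The count
    includes ys itself, which shifts every rank by the same constant.
    """
    observed = empirical_loss(xs, ys, k)
    tol = 1e-12  # guard against floating-point noise in the comparison
    return sum(
        1
        for fictitious in product(labels, repeat=len(xs))
        if empirical_loss(xs, fictitious, k) <= observed + tol
    )


def select_k(xs, ys, candidates):
    """Pick the complexity parameter k with the smallest loss rank."""
    return min(candidates, key=lambda k: loss_rank(xs, ys, k))


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5]
    ys = [0, 0, 0, 1, 1, 1]
    for k in (1, 2, 3):
        print(k, loss_rank(xs, ys, k))
    print("selected k:", select_k(xs, ys, [1, 2, 3]))
```

Under these assumptions, k = 1 fits every labelling with zero loss, so its loss rank is the full count of 2^n label vectors: a perfect fit that is easy to achieve is ranked worst, which is the intuition the abstract describes. The enumeration grows exponentially in n, so this is only a way to see the definition at work, not a practical procedure.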

Similar resources

Model Selection with the Loss Rank Principle

A key issue in statistics and machine learning is to automatically select the “right” model complexity, e.g., the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. We suggest a novel principle, the Loss Rank Principle (LoRP), for model selection in regression and classification. It is based on the loss rank, whi...

Model Selection by Loss Rank for Classification and Unsupervised Learning

Hutter (2007) recently introduced the loss rank principle (LoRP) as a general-purpose principle for model selection. The LoRP enjoys many attractive properties and deserves further investigation. The LoRP has been well studied for the regression framework in Hutter and Tran (2010). In this paper, we study the LoRP for the classification framework, and develop it further for model selection problems in ...

A Multiple Objective Nonlinear Programming Model for Site Selection of the Facilities Based on the Passive Defense Principles

One of the main principles of passive defense is the principle of site selection. In this paper, we propose a multiple objective nonlinear programming model that considers the principle of site selection in terms of both qualitative and quantitative aspects. The purpose of the proposed model is to select the location of the facilities of a system in which not only it observes the dispersion ...

Using the Hybrid GA-TOPSIS Algorithm to Solving the Site Selection Problem in Passive Defense

One of the main principles of passive defense is the principle of site selection. In this paper, we propose a multiple objective nonlinear programming model that considers the principle of site selection in terms of both qualitative and quantitative aspects. The purpose of the proposed model is to select the location of the key production facilities of a system in which not only it observes ...

A Grey-Based Fuzzy ELECTRE Model for Project Selection

Project selection is considered an important problem in project management. It is multi-criteria in nature and is based on various quantitative and qualitative factors. The main purpose of this paper is to present a new rank-based method for project selection in outranking relation. According to this approach, decision alternatives were clustered in the concordance matrix and the discordance...

Publication date: 2007